Week 2: Cognitive Perspectives and Introduction to ggplot2

Emorie D Beck

Quick Review

Review

What are the core elements of ggplot2 grammar?

From last week:

  • Mappings: base layer
    • ggplot() and aes()
  • Scales: control and modify your mappings
    • e.g., scale_x_continuous() and scale_fill_manual()
  • Geoms: plot elements
    • e.g., geom_point() and geom_line()
  • Facets: panel your plot
    • facet_wrap() and facet_grid()
  • Themes: style your figure
    • Built-in: e.g., theme_classic()
    • Manual: theme() (legend, strip, axis, plot, panel)
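
As a quick refresher, the five elements above can be combined in a single plot. This is only a minimal sketch: the `toy` data frame, its column names, and the colors are invented for illustration and do not come from the course materials.

```r
library(ggplot2)

# Toy data, invented for illustration only (not the course data)
toy <- data.frame(
  x     = rep(1:5, times = 2),
  y     = c(2, 4, 5, 7, 9, 1, 2, 4, 4, 6),
  group = rep(c("A", "B"), each = 5)
)

p <- ggplot(toy, aes(x = x, y = y, color = group)) +  # mappings
  geom_point() +                                      # geoms
  geom_line() +
  scale_x_continuous(breaks = 1:5) +                  # scales
  scale_color_manual(values = c(A = "steelblue", B = "darkorange")) +
  facet_wrap(~ group) +                               # facets
  theme_classic() +                                   # built-in theme
  theme(legend.position = "bottom")                   # manual theme tweak

p
```

Each `+` adds one layer of the grammar, so any piece can be dropped or swapped without touching the others.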

Part 1: Proportions

Visualizing Proportions

  • Proportions are often important in our research
  • From describing sample-level differences to describing the frequency of behaviors / events / experiences, etc., we often reach toward describing amounts relative to the whole
  • But our goals vary, which necessitates the use of different graphics

Part 1: Agenda

  • We will cover the following kinds of visualizations, all of which were covered in your readings
  • We will cover both when to use them and how to create them
    • Pie Charts
    • Bar Charts (Stacked)
    • Bar Charts (Side-by-Side)
    • Bar Charts and Density Across Continuous Variables
    • Mosaic Plots
    • Parallel Sets

But First, Our Data

  • Today, we’ll use the teaching sample from the German Socioeconomic Panel Study (GSOEP)
  • GSOEP is an ongoing longitudinal panel study that began in 1984 (26 waves of data!)
  • ~20,000 people are sampled each year
  • Samples households in Germany
  • Has additional sub-projects (e.g., innovation studies, migrant panel, etc.)
  • The data are publicly available via application
# A tibble: 360,553 × 9
    year    SID marital chldbrth gender         yearBrth mortality   job   age
   <dbl>  <dbl>   <dbl>    <dbl> <dbl+lbl>         <dbl>     <dbl> <dbl> <dbl>
 1  1999    901       4        0 2 [[2] Female]     1951         0    NA    48
 2  1999   1202       3        0 2 [[2] Female]     1913         0    NA    86
 3  1999   1901       2        0 2 [[2] Female]     1948         0    NA    51
 4  1999   2301       2        0 1 [[1] Male]       1946         0    NA    53
 5  1999   2302       2        0 2 [[2] Female]     1946         0    NA    53
 6  1999   2501       2        0 2 [[2] Female]     1924         0    NA    75
 7  1999   2801       2        0 1 [[1] Male]       1947         0    NA    52
 8  1999   2802       2        0 2 [[2] Female]     1956         0    NA    43
 9  1999 910603       2        0 1 [[1] Male]       1959         0    NA    40
10  1999   2901       2        0 1 [[1] Male]       1932         0    NA    67
# … with 360,543 more rows
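
Before plotting, rows like these usually need to be collapsed into counts and proportions. One way to do that is sketched below, assuming dplyr is available. The `toy` tibble only mimics a few of the columns shown above, and its values are invented for illustration; they are not GSOEP data.

```r
library(dplyr)

# Toy rows mimicking the columns above; values are invented for illustration
toy <- tibble(
  SID     = 1:8,
  marital = c(2, 2, 2, 4, 3, 2, 1, 1),
  gender  = c(2, 2, 1, 2, 2, 1, 1, 2)
)

# Count rows per marital status, then divide by the total to get proportions
props <- toy %>%
  count(marital) %>%
  mutate(prop = n / sum(n))

props
```

The same pattern works for any grouping variable, e.g. `count(gender)`, or for both at once with `count(marital, gender)`.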

Pie Charts

  • You may be wondering if you should ever use a pie chart
  • The answer is, of course, it depends
  • Pie charts are great when:
    • What you want to visualize is simple (e.g., basic fractions)
    • You want to clearly emphasize proportion relative to the whole
    • You have a small data set

Pie Charts

  • In our data, we have a few variables that fit these criteria, but we’ll focus on two:
    • marital status (4 groups)
    • gender (2 groups)
  • ggplot2 doesn’t specifically support pie charts
  • Why? Because it’s a layered grammar of graphics, and an explicit pie chart function would be redundant with some of the built-in coordinate systems
    • specifically, coord_polar()
  • So to make a pie chart, we’ll use geom_bar() + coord_polar()
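
The geom_bar() + coord_polar() approach can be sketched as follows. This is a minimal example under stated assumptions: the `toy` data frame is a made-up stand-in for the GSOEP sample, and the marital status codes in it are invented for illustration.

```r
library(ggplot2)

# Toy stand-in for the GSOEP sample; values are invented for illustration
toy <- data.frame(
  marital = factor(c(1, 1, 2, 2, 2, 2, 3, 4, 4, 2))
)

pie <- ggplot(toy, aes(x = "", fill = marital)) +
  geom_bar(width = 1, color = "white") +  # a single stacked bar of counts
  coord_polar(theta = "y") +              # map the stacked counts to angles
  theme_void()                            # drop axes, which mean little on a pie

pie
```

Without `coord_polar()`, the same code draws one stacked bar, which is why ggplot2 does not need a dedicated pie chart geom.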

Pie Charts

Improvements

Part 2: Probability